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Abstract 

We extend Robust Optimization to fractional programming, where both the objec¬ 
tive and the constraints contain uncertain parameters. Earlier work did not consider 
uncertainty in both the objective and the constraints, or did not use Robust Optimiza¬ 
tion. Our contribution is threefold. First, we provide conditions to guarantee that 
either a globally optimal solution, or a sequence converging to the globally optimal 
solution, can be found by solving one or more convex optimization problems. Sec¬ 
ond, we identify two cases for which an exact solution can be obtained by solving a 
single optimization problem: (1) when uncertainty in the numerator is independent 
from the uncertainty in the denominator, and (2) when the denominator does not con¬ 
tain an optimization variable. Third, we show that the general problem can be solved 
with an (iterative) root finding method. The results are demonstrated on a return- 
on-investment maximization problem, data envelopment analysis, and mean-variance 
optimization. We find that the robust optimal solution is only slightly more robust 
than the nominal solution. As a side-result, we use Robust Optimization to show that 
two existing methods for solving fractional programs are dual to each other. 


1 Introduction 

A fractional program (FP) is an optimization problem, where the objective is a fraction of two 
functions. It can be used for an economical trade-off such as maximizing return/investment, 
maximizing return/risk or minimizing cost/time (Schaible and Ibaraki, 1983). A comprehen¬ 
sive overview of FP papers, containing over 550 references which include many applications, 
is given by Schaible (1982). More up-to-date references can be found in (Stancu-Minasian, 
2013), which also refers to six preceding bibliographies by the same author. 

More often than not, the parameters in an optimization problem are affected by uncer¬ 
tainty. Robust Optimization (RO) is about solving optimization problems with uncertain 
data in a computationally tractable way (see, e.g., Ben-Tal et ah (2009); Bertsimas et al. 
(2011a)). The key concept is that a solution has to be feasible for all realizations of the 
uncertain parameters, which are assumed to reside in convex uncertainty regions. 
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Sometimes, the objective is the maximum of hnitely many fractions, and the feasible 
region is a convex set. This is called a generalized FP (see, e.g., Barros et ah (1996); 
Crouzeix and Ferland (1991) for solution methods). A generalized FP with inhnitely many 
fractions in the objective was solved by Lin and Sheu (2005) using a cutting plane method, 
that uses a set of finitely many fractions that is extended in each step. They do not men¬ 
tion that their method can be used to deal with uncertain data, and do not use existing 
results from RO. Our work can be seen as an alternative approach, where we also deal with 
uncertainty in the constraints. 

The Lagrange dual of a robust FP was studied by Jeyakumar et al. (2013), extending a 
result by Beck and Ben-Tal (2009). They assume that the uncertainty in the numerator of 
the objective is independent from the uncertainty in the denominator. The dual is tractable 
when the numerator, denominator and constraints are linear, and the uncertainty regions 
are ellipsoidal or hnite sets of scenarios. In this paper, we focus on the primal problem. 
Nevertheless, in Section 4.2, we obtain and extend the list of tractable duals. 

The aim of this paper is to combine FP and RO, to provide a comprehensive overview 
of the solution methods, and to investigate the improvement of RO on numerical examples. 
First, we provide conditions that guarantee that a globally optimal solution, or a sequence 
that converges to the globally optimal solution can be found by solving one or more convex 
problems. The importance of these conditions is illustrated with a numerical example from 
literature. Second, we identify two cases for which an exact solution can be obtained by 
solving a single optimization problem. Third, we show that the general problem can be 
solved with an (iterative) root hnding method. 

In Section 2, we outline two existing solution methods for FPs, and present a new result 
showing that the two approaches are each others duals. In Section 3, we present existing 
results in RO, that will be used for FPs as well. Our main contribution is given in Section 
4. The results are demonstrated on a retnrn-on-investment maximization problem, data 
envelopment analysis, and mean-variance optimization in Section 5. 


2 Solving Nonrobust Fractional Programs 


In this section, we present two existing methods to solve FPs, and show that these methods 
are dnal to each other. To the best of onr knowledge, this dnality resnlt is new. Consider 
the following general formulation of an FP: 


(FP) min 

a:€K" 


/N 

9{x) 


s.t. hi{x) < 0, Vi e I. 


We will assnme that the constraint index set / is hnite, that / is convex and non-negative 
and that g is concave and positive over the feasible region. When the fnnctions /, g and hi 
are affine, (FP) is a linear fractional program: 


(LFP) min 

kGR" 


bo + X 
Cq + X 


s.t. doi + di X < 0, 


i e I. 
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Charnes and Cooper show that (LFP) can be reformulated as an (equivalent) LP, by making 
the substitutions y = x/{cq + c x) and t = l/(co + c x) (Charnes and Cooper, 1962): 

(CC-LFP) min bot + b y s.t. doit + dJw < 0, Vi G /, Cot + y = 1. 

tGR+,3/GR" 

An optimal solution of (LFP) is obtained from an optimal solution of (CC-LFP) by comput¬ 
ing X = y/t. Schaible (1974) shows that a similar substitution {y = x/g{x), t = l/g{x)) 
transforms (FP) into an equivalent convex programming problem: 

(Schaible-FP) ^ < 0, Vi G I. 

This is indeed a convex problem, since the perspective function p{y,t) := tf {y/t) is jointly 
convex on x R+ when / is convex on R”. Furthermore, an optimal x is obtained from 
X = y/t. Schaible also shows that, if the constraint tg{y/t) > 1 is formulated as an equality 
constraint: tg{y/t) = 1, it is not necessary to require / to be positive (Schaible, 1974). 

A different solution approach uses the auxiliary parameterized optimization problem 
F{a), dehned as mina,g]Rn{/(a:;) — ag{x) : hi{x) <0 Vi G /}. The objective value of (FP) is 
at least a if and only if F{a) > 0 (Dinkelbach, 1967). So, the objective of (FP) equals the 
largest a such that F{a) > 0: 

(P-FP) maxio; : min{/(a3) — ag{x)} > 0}, 

«€R+ xGX 

where X denotes the feasible region {x G R” : hi{x) < 0, Vi G /}. The usual way of solving 
this problem is by hnding the root of F, since the corresponding x is optimal for (FP) 
(Dinkelbach, 1967). This is usually done with a Newton-like algorithm, where there is some 
freedom in choosing the next iteration point (Chen et ah, 2009). The root of F is unique, 
since F is monotonically decreasing in a. The parameteric program F{a) is convex when 
g is affine or when / is non-negative on the feasible region (since then, only non-negative 
values for a needs to be considered). For these cases, the Newton method to hnd the root of 
F was described by Dinkelbach (1967), which creates a monotonically decreasing sequence, 
that converges superlinearly and often (local) quadratically to a root of F (Schaible, 1976). 

We now show that these approaches are dual to each other when / is non-negative. The 
proof for affine g and possibly negative / is similar. 

Theorem 2.1 Assume that / is convex and non-negative on X, g is concave on X, X 
is closed and convex, and the optimal value of (FP) is attained. Then (Schaible-FP) and 
(P-FP) are dual to each other, and strong duality holds. 

Proof. First note that the following reformulation is equivalent to (P-FP): 

(RP-FP) max{Q; : f{x) — ag{x) > 0, Va; G X}. 

o€R+ 

The remainder of the proof is based on the theory “primal worst is dual best” introduced 
by Beck and Ben-Tal (2009). They assume that X is compact and convex. Since we have 
assumed X to be closed and convex and since an optimal solution of (FP) is attained, 
compactness can be achieved by intersecting X with a box that includes the optimal solution. 
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Additionally, Beck and Ben-Tal assume that the constraint in (RP-FP) is convex in a and 
concave in x, which indeed holds. For fixed a?, (RP-FP) is an LP with the following dual; 

(D-LP) min{t/(a;) : tg{x) > 1}. 

While (RP-FP) is robust since the constraint has to hold for all x in X, the constraint in 
the optimistic counterpart of (D-LP) has to hold for a single x: 

(OD-LP) min {tf{x) : tg{x) > 1}. 

tGK+,xGX 

(RP-FP) and (OD-LP) are dual to each other (Beck and Ben-Tal, 2009). Strong duality 
holds, since {x, t) is a Slater point for (OD-LP) for sufficiently large t. It is obvious that 
(OD-LP) and (Schaible-FP) are equivalent, since t = 0 is infeasible for (OD-LP). ■ 


3 Robust Optimization 

There are currently two generic methods that deal with an infinite number of constraints. 
The first method is applicable to both linear and nonlinear constraints, while the second 
method can only be applied to robust LPs. 

The first (“constraint wise”) approach uses conic duality (Ben-Tal et ah, 2009) or Fenchel 
duality (Ben-Tal et ah, 2015). The vector x in M” satisfies the following infinite number of 
constraints: 


hi{ai, x) < 0, Va* G ; Tij{ai) < 0, Vj G J, 


if and only if there exists Uij G M+ and Vij G (j E J, J being a finite set), such that x 
satisfies the following convex constraint: 


E 




i-J 




o. 




< 0 , 


( 1 ) 


where r* (s) = sup„.gjji{s^ai - Tij{ai)} and {hi)^{s,x) = inf„.eRL{s^aj - hi{ai,x)} are 
the convex and concave conjugates of and hi, respectively. This approach requires the 
constraint to be concave in a*, the functions to be convex, and ri(dom hi{-,x)) fl Ti{Ui) 
^ 0 for all X G where Ui := {a* G : Tij{ai) < 0, Vj G J}. This approach yields a 
tractable formulation for many constraints and many uncertainty sets (see Tables 1 and 2 in 
(Ben-Tal et ah, 2015)), even if the conjugates do not have closed-form expressions. To give 
an impression of the broad applicability of this method, let us cite some examples, for which 
it provides a tractable reformulation. For uncertainty sets, one could have a norm-bounded 
(e.g., box or ball), polyhedral or conic representable set, or a generic set defined by (convex) 
power functions, exponential functions, negative logarithms, or any function for which the 
convex conjugate exists. Constraints could be linear or quadratic in the uncertain parameter. 
There are some operations on functions that preserve the availability of a tractable expression 
of the conjugate. One is multiplication with a non-negative scalar: the concave conjugate of 
tf(ao,y/t) (with respect to ao) for t > 0, is the perspective of the concave conjugate of /: 
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tf\{s/t,y/t). Another one is when hi is the sum of two functions: hi = hn + hi 2 . Suppose 
closed-form conjugates exist for hn and hi 2 separately; then: 

(hi)*(s,a3)= max {(hii)*(si, a?) + (hi 2 )*(s 2 , a?) : Si + ^2 = «} • (2) 

sieM^,S2GK^ 

When substituting (2) into (1), the max operator in (2) may be omitted, since if the resulting 
constraint holds for some Si and S 2 , then it surely holds for the maximum. This example 
shows that a closed-form expression for the conjugate is indeed not always required. 

The second method solves any robust LP with a convex uncertainty region via its La¬ 
grange dual (Gorissen et ah, 2014). The transformation from the primal to the dual is a 
three step procedure. First, the dual of the nonrobust LP is formulated, where the uncertain 
parameters are assumed to be known. Second, instead of enforcing the constraints for all 
realizations of the uncertain parameters (“robust counterpart”), the constraints of the dual 
have to hold for a single realization of the uncertain parameters (“optimistic counterpart”). 
So, the uncertain parameters are added to the set of optimization variables. The last step is 
to reformulate the nonconvex optimistic counterpart to an equivalent convex optimization 
problem. The optimal solution of the resulting problem can be translated to an optimal 
solution of the original robust LP via the KKT vector. 

In the remainder, we provide reformulations based on (1), but the reader should be aware 
that the other approach may be useful when all functions involved are linear. 

4 Solving Robust Fractional Programs 

In this section, we show how to solve (R-FP). It is our aim to obtain Robust Counterparts 
(RCs), that can be solved with existing Robust Optimization methods. First, we formu¬ 
late conditions that give raise to convex optimization problems. Under these conditions, 
a globally optimal solution can be found by solving a single convex optimization problem 
(Sections 4.2 and 4.3). These conditions also guarantee that, in the general case (Section 
4.4), a root hnding method produces a sequence of convex optimization problems, whose 
solutions converge to a globally optimal solution. The results of this section are summarized 
in Tables 1-3. 

4.1 Robust Formulation and Assumptions 

The uncertain parameters, denoted by cij, are assumed to he in sets Ui C which we 
dehne using functions Tij : —)• M: 

Ui := {ai e : Tij{ai) < 0, Vj G J}, (3) 

where J is a hnite set. In the RC of (FP), the constraints have to be satished by all 
realizations of the uncertain parameters: 

(R-FP) min (a) s.t. —1 — « < 0, Vao £ Uq 

aGR,a;eK" g{ao,x) 

hi{ai,x) < 0, Vaj G Ui, Vi G I. (4) 
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Note that the uncertainty is specihed constraint-wise, which is possible even if the parameters 
in different constraints are correlated (Ben-Tal et al., 2009, p. 12). We make the following 
assnmptions: 

(a) Tij are convex and the sets W* are convex and compact, 

(b) / and hi are convex in x for every hxed valne of a* in Ui, 

(c) g is concave in x for every hxed valne of aj in Ui, 

(d) / and hi are concave in for every feasible x, 

(e) g is convex in for every feasible x, 

(f) g{ao, a;) > 0 for every ao in Uq and every feasible x, and 

(g) /(ao, a;) > 0 for at least one ao G Uq and for every feasible x. 

The last assnmption is not necessary if g is biaffine, i.e., when g is affine in each parameter 
when the other parameter is hxed, of which we show the conseqnences in Section 4.5. In 
robust linear programming, the assnmption that Ui is compact and convex is made withont 
any loss of generality (Ben-Tal et al., 2009, p. 12). For robust FP, compactness is not a 
restriction since the fnnctions hi are continnons and the constraints do not contain strict 
ineqnalities. So, the problem remains nnchanged if the nncertainty region is replaced with 
its closure. However, requiring Ui to be convex is a restriction (nnless /, g and h are affine 
in aj), that is necessary for nsing existing results in RO. 

Assnmptions (d) and (e) are made solely because they are required by generic methods 
to derive a tractable RC. There are some examples where the RC can be derived even 
thongh these conditions are not fnlhlled, e.g., if the nncertainty region is the convex hnll 
of a limited nnmber of points and the constraint is convex in the nncertain parameter, for 
a conic quadratic program with implementation error, or when the iS-lemma or a snms of 
sqnares result can be applied (Ben-Tal et al., 2009, 2015; Ben-Tal and den Hertog, 2014; 
Bertsimas et al., 2011b). In these cases, assnmptions (d) and (e) are not necessary. 

In literatnre, a problem is solved, that does not satisfy these reqnirements (Lin and Shen, 
2005). While the anthors claim that this did not affect their compntations, and that they 
hnd the global optimnm, in Appendix A we show that their solntion is snboptimal. 

4.2 Special Case: Uncertainty in the Numerator is Independent 
of the Uncertainty in the Denominator 

Snppose that the nncertainty in the nnmerator of the objective is deconpled from the nncer¬ 
tainty in the denominator: 

(R-Sl) min (a) s.t. — I — a < 0, Vao G Uq, Vag G Un 

aeR+,a:eR’* g(a'(„ x) 

hi(ai, x) < 0, Vaj G Ui, Vf G I. 
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We claim that (R-Sl) is equivalent to the RC of the Schaible reformulation: 


(R-Schaible-Sl) 


min 

oeIR+,tGK++,3/eR" 


(a) s.t. 


tf (^ao, -a<0, Vao e Uq 

tg (^a'o, > 1, Va'o G 

thi (^ai, < 0, Va* G W*, Vi G I. 


( 5 ) 


Clearly, an optimal solution of (R-Schaible-Sl) exists, for which t = sup^^g^^ 1/g{a'Q,y/t). 
Equivalence between (R-Sl) and (R-Schaible-Sl) readily follows from the substition x = y/t, 
that converts a feasible solution of one problem to a feasible solution of the other problem. 

This result extends (Jeyakumar et ah, 2013). They provide the dual of (R-Sl), and 
show that strong duality holds. In case /, g and hi are linear, they show that the dual 
of (R-Sl) is tractable, when the uncertainty region is ellipsoidal, or consists of a finite set 
of scenarios. The resulting problems can also be obtained from our work, by applying the 
solution method by Gorissen et al. (2014) to (R-Schaible-Sl). In addition to ellipsoids or 
scenarios, our method works with any convex uncertainty region, such as a polyhedral set 
or a conic quadratic representable set. A similar result was found by Kaul et al. (1986), but 
that result is wrong (see Appendix B). 

We provide a reformulation of (R-Schaible-Sl) if the RO method using conjugates is used 
(eq. (1)). The resulting equivalent problem becomes: 


mm a 


s.t. I] 1 77^ I - t/. 

3<^J 


Voj\ y 


j&J 


'OR'i 


Uoj 

, Wo'7 


-„<o 


- tg. 




\ t 




^ UijT*j ^ - t{h. 




i&J 


u 


o, 


1 <0, WgJ 


a G M+, t G M++, u G ^ ^ ]^(hl+ 2 )x|J|xL^ ^ ^ 


4.3 Special Case: the Denominator Does Not Depend on the Op¬ 
timization Variable x 

If the optimization variables do not appear in the denominator, (R-FP) is equivalent to (cf. 
Ben-Tal et ah, 2015, Ex. 30): 

(R-S2) min (a) s.t. f{ao,x) — ag{ao) < 0, Vao G Wo 

hj(aj, x) < 0, Vttj G Wj, Vi G I. 


Note that g indeed does not depend on x. (R-S2) can be solved via the following equivalent 
convex reformulation using (1), and standard techniques for the conjugate of the sum of two 
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functions: 


min (a) s.t. tojToj - /* H ^oj, a;j + ag* < 0 

E - (^0* ( H Vij, a; I < 0, Vi e / 

jGj \ / \j£j J 

aeR+,te Rf'+^^^'‘^1, seR^,ve m(I^I+Vx|j|xl^ ^ ^ ^n_ 

4.4 General Case 

We now show how to solve the general problem (R-FP) using the following parametric 
problem: 

F(q;) := min (w) s.t. f{ao,x) — ag{ao,x)<w,yaoEUo (6) 

hj(aj, x) < 0, Vaj G Wj, Vi G /, 


which is a convex optimization problem, since we only have to solve it for a G R+. Let a* 
be a root of F. Lin and Sheu show that an optimal solution of (R-FP) is the minimizer 
X of F{a*), if the feasible region for x is compact (Lin and Sheu, 2005). We assume from 
now on that the constraint functions hi dehne a compact feasible region. Moreover, they 
show that F{a) < 0 if and only if a > a*. Lin and Sheu do not use results from RO to 
arrive at the deterministic reformulation (7). Instead, they replace the set Uq with a hnite 
set, approximate F{a) with an entropic regularization method, and iteratively generate a 
sequence dk that converges to a*. The approximation becomes more accurate as the root 
of F is approached. The reason why they approximate F{a) is because they claim that 
computing its value is difficult. Our approach is to solve F{a) using RO, which inherently 
produces tractable problems. The following convex reformulation using (1) is equivalent: 

F{a) = mm (w) s.t. - f* \ ^ + x\ + ag* x'] < w (7) 

jej \'^oj J \ j(zj J \a / 

I < 0, Vi G / 

jej \ W / \jej j 

t G s G R^, u G M(lh+i)x|J|xL^ w G R, a; G R”. 

Since F is monotonically decreasing in a, as for FPs and generalized FPs, existing root- 
hnding methods can be used. We mention a few of these that produce a sequence {ak} 
which converges to a*: 

(a) The bisection method. Bounds on the interval that contain a* are: 


aLB ■= mm{f{ao,x)/g{aQ,x) : hi{ai,x) < 0, Va* eUi, Vi G /} 

a;gR" 

auB ■= sup f{ao,x)/g{ao,x), 

ao €Uo 
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( 8 ) 

(9) 



where (8) is computed for a fixed d,o from Wq, and (9) is computed for some x that 
is (robust) feasible, i.e., for an x that satishes (4). These bounds can be computed 
relatively easily using the Schaible reformulation. If the lower bound (8) is hard to 
compute due to the “for all” quantiher in the constraints, it may be computed for hxed 
di from Ui- Since F^aiB) > 0 and F{auB) < 0, and since F is clearly nonincreasing, 
a* lies in [q:lb,C(ub]- The middle point of this interval is := 0.5{aLB + oiub)- 
By evaluating F{ak), the width of the interval that contains a* can be halved: if 
F{ak) > 0, then set a^B = ctfc, otherwise set auB = C(k- By increasing /c by 1 and 
repeating this procedure, a series {ctk} is constructed, that converges to a*. 

(b) The Dinkelbach type algorithm by Crouzeix et ah (1985), adjusted for inhnitely many 

ratios. The method starts with k = 0 and f{ao,x)/g{ao,x) for some 

feasible x. Then F{ak) is computed, with maximizer Xk- If F{ak) < 0, then the next 
a is determined by a^+i := f{aQ,Xk)/g{ao,Xk). Computing ak+i requires 

solving an FP. The method proceeds by increasing k by 1, and again computing F{ak)- 
If the feasible region for x is compact, then the series converges linearly to 

(c) The same as method (b), except that the right-hand side of (6) is multiplied with 
g{aQ, Xk)'. f{cLo, x) — ag{aQ, x) < wg{ao, Xk). This may increase the speed of conver¬ 
gence for the same complexity of computation (Crouzeix et ah, 1986). 

(d) The same as method (b) or (c), except that the that maximizes F{ak) is used to 
compute ttfc+i, instead of solving a new optimization problem. The worst case ao in 
the computation of F{ak) can be recovered without much computational effort. Thus, 
ttfc+i = /(ao, Xk)/g{ao, Xk) is computed more efficiently than in the method (b) or (c). 
Additional work is required to ensure convergence of {afcj to a* (Crouzeix and Ferland, 
1991, Section 5). 

Let Xk be the maximizer of F{ak). If a root finding method finds the root in a finite number 
of steps, then an exact solution of (R-FP) is found. Otherwise, Crouzeix et ah show that, if 
the sequence {ctk} converges to a*, then any convergent subsequence of {xk} converges to 
the optimal solution x* of (R-FP) (Crouzeix et ah, 1985, Theorem 4.1c). 

4.5 Consequences when the Denominator is BiafRne 

The assumption that the numerator is positive, ensures that the objective value of (R-FP) is 
positive over the feasible region. Consequently, we could assume a G M+; this would produce 
a convex optimization problems. If g is biaffine, then the resulting problems are also convex 
for a < 0. We shall discuss the results to each of the three aforementioned cases separately. 
For the first special case (Section 4.2), the restriction that the numerator is positive may be 
dropped only if the denominator does not contain an uncertain parameter. Then, (R-Sl) 
and (R-Schaible-Sl) are equivalent if a G M+ is replaced with a G M and (5) is stated as 
an equality (cf. Schaible (1974)). The reason why the denominator may not contain an 
uncertain parameter, is because t = 1/g{x,aoi) is not possible for multiple aoi. 

For the second special case (Section 4.3), the denominator only depends on Oq, so “bi¬ 
affine” in the title of this subsection should be read as “affine”. When (R-S2) is solved for 
a G M, the restriction that the numerator is positive, may be dropped. 
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For the general case (Section 4.4), no changes need to be made to drop the restriction 
that the numerator is positive. 

Table 1: Tractable cases when uncertainty in the numerator is independent 
of the uncertainty in the denominator, i denotes an affine function. 


/ 

9 

hi 

sgn(/) 

(R-Schaible-Sl) 

f(ao,x) 


hi{ai,x) 

> 0 

no modihcations 

f(ao,x) 


hi{ai,x) 

any 

a G M, (5) as an equality 


Table 2: Tractable cases when the denominator does not depend on x. i 
denotes an affine function. 


/ 

9 

hi 

sgn(/) 

(R-S2) 

f{ao,x) 

9{ao) 

hi{ai,x) 

> 0 

no modihcations 

fiao,x) 

^{ao) 

hi{ai,x) 

any 

a e M 


Table 3: Tractable cases for the general case, i denotes a biaffine function. 


/ 

9 

hi 

sgn(/) 

f{ao,x) 

g{ao,x) 

^ 2(^25 

> 0 

fiao,x) 

l{ao,x) 

hi{ai,x) 

any 


5 Numerical Examples 

In this section, we test our method on three examples: a multi-item newsvendor problem 
(Section 5.1), mean-variance optimization (Section 5.2), and data envelopment analysis (Sec¬ 
tion 5.3). 

5.1 Multi-item Newsvendor Example 

In Gorissen et al. (2014), a multi-item newsvendor problem is solved by minimizing the 
investment cost under the condition that at least a certain expected proht is made. 

We show how to directly optimize the expected return on investment for this example. Let 
us hrst recapitulate the problem. The newsvendor buys Qi units of item i at the beginning 
of the day. Each item has its associated ordering cost q, selling price Uj, salvage price r*, 
and unsatished demand loss /*. We assume rj < Vi + k. During the day the newsvendor faces 
a demand di, resulting in a proht of min{nj( 5 i + li{Qi — di) — CiQi, vA + ri{Qi — di) — CiQij. 
The demand is not known in advance, but there are hnitely many demand scenarios d^g (s 
in S) that occur with (uncertain) probability pis, independently of other items. 
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The problem of maximizing expected return on investment can be formulated as: 


(R-NV) 


max 

S.t. 


Pis'^is 

mm -—- 

Pi&Ji CiQi 

Uis + (Ci - Vi) Qi < dis {vi - Ti), \fi e I, Ws e S 
^is “t“ (Ci Vi Z^) Qi ^ disli^ G h, V-S G 


where Uis is the contribution to the proht of item i in scenario s, and the convex and compact 
uncertainty regions Ui are dehned using the Matusita distance, which is a 0-divergence 
measure (Ben-Tal et ah, 2013): 


Ui = Ipie M+' : ^Pis = \iPisT - < P, Vf G jL 

I sGS sGS ) 

Note that the assumptions (a)-(e) and (g) are always fulhlled, and that assumption (e) is 
fulhlled if at least one item is bought. (R-NV) can be classihed under the hrst special case 
(Section 4.2). Since all functions are affine and the denominator is certain, the Schaible 
reformulation and the Charnes-Cooper reformulation are equivalent. (R-NV) is therefore 
equivalent to: 

(R-CC-NV) 

max min EE Pis^is 

iGl sGS 

s.t. Uis + {Ci - Ti) Qi < dis {vi - Vi) t, yi e I, e S 
^is “1“ (c^ Vi Qi ^ dislit^ G h, V5 G S 

^ ) CiQi 1, 

iGl 

where Q and u in (R-CC-NV) have to be divided by t to obtain the Q and u in (R-NV). 
(R-CC-NV) is a linear program with a convex uncertainty region that we solve via its dual, 
as outlined in the introduction. The last of the reformulation steps, a substitution, is not 
necessary since the uncertainty only appears in the objective. Let Xj*, Pis and be the dual 
variables of (CC-NV); then the optimistic dual (OD-CC-NV) is given by: 


(OD-CC-NV) 


min 


s.t. 


^is T Vis Pis} ^ L, V 5 G S 

E{('^* “ + (c* i'i)yis} + CiZ > 0, Vf G / 

sGS 

EE disi^i Pi)^is ^is^iVis ^ 0 

seS 


'^Pis = 1, Vf G / 

sGS 


E <p,^i el 

sGS 

p £ Kf® e Ri'l’*'®', y £ 2 £ 


( 10 ) 

(11) 
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The optimal value of (OD-CC-NV) is the robust expected return on investment. The corre¬ 
sponding optimal order quantities Q can be derived from the KKT vector of (OD-CC-NV), 
by dividing its elements associated with (10) by the element corresponding to (11). This is 
the same as dividing Q by f in (R-CC-NV) to undo the Charnes-Cooper transformation. 

We solve the problem for the same data as Gorissen et al. (2014) with AIMMS 3.11 
(Paragon Decision Technology, the Netherlands) and KNITRO 7.0 (Zienna Optimization 
LLC, USA) with its default settings. Computation errors for negative Pis were avoided 
by using \pis\°‘ instead of {pis)°‘- We take p = 0 to obtain the nominal solution, while 
p = 0.03 for the robust solution. Solutions were obtained in less than 0.01 seconds. When 
the probabilities are as expected (p = 0), the expected return on investment of the nominal 
solution is 0.297, while for the robust solution it is 0.285. When p = 0.03 and the worst 
case probabilities occur for the nominal solution, i.e., the probabilities that minimize the 
expected return on investment for the nominal solution, the objective value drops to 0.211, 
while for the robust solution it drops to 0.214. So, the solution indeed becomes more robust, 
but the difference with the nominal solution is small. We verify if the decision maker could 
have done better, if he knew beforehand which probability vector realizes. This done by 
optimizing the nominal model (p = 0), while setting the probabilty estimates pis equal to 
the worst case probabilities for the robust solution. This gives the so-called perfect hindsight 
solution. The objective valne is as low as 0.214. So, even thongh the robust objective could 
deteriorate substantially, there is no other solntion that performs better. 

5.2 Mean-variance Optimization 

We are to present an example that involves a trade-off between mean and variance. This 
trade-off is commonly used in portfolio optimization, inclnding the Modern Portfolio Theory 
(MPT) fonnded by Markowitz (1952), where the goal is to select the right mix of assets. 
In contrast to MPT, we do not impose that the expected returns on the assets and the 
covariance matrix are fully known. Instead, we assnme that hnitely many scenarios s (in S) 
for the fnture can be identihed along with nnknown probabilities of occnrence p*, which are 
estimated by ps- The retnrn of asset i in scenario s is a constant r^g, so when Xi nnits of 
money are invested in asset i, the retnrn in scenario s is given by Ug = f'isXi (possibly 
negative). The expected retnrn and variance are given by: 


E(retnrn) := '^PgUg 



sGS 



Var(retnrn) ;= ^p* {ug 

— E(retnrn))^ 

(12) 

sGS 

/ \ 2 


= - 

ses 

VsGS / 

(13) 
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To remain in the minimization framework, the objective is to minimize the variance-to-mean 
ratio (or the dispersion index). The robust optimization problem is given by: 


(R-I) 


min 

s.t. 


EsesPsU'i - {EsGSPsUsf 

ni3,x - 

pew EsesPsUs 

Us = '^risXi, \fs e S 

iei 

^Xi = C 
iei 

^PsUs > 0, Vp G U. 

sGS 


The last two constraints ensure that C units of money are invested, and that this model 
has a feasible solution only if the expected proht is positive. The numerator is convex in 
Us (from (12)) and concave in ps (from (13)). The denominator is clearly concave in ps 
and convex in Us- Moreover, the numerator is non-negative and the denominator is positive 
on the feasible region. For the uncertainty region we use the modihed y^-distance as a 
0-divergence measure, which can be justihed by statistical theory (Ben-Tal et ah, 2013): 

Mo := (pe Ki?' : Ep. = 1. E kiyFl! < d , 

I tts Ps J 


(R-I) is not one of the special cases, so we solve this problem using the general method. In 
order to formulate (7) explicitly, we hrst derive some conjugate functions. The conjugate for 
/ is from Ben-Tal et al. (2015, Ex. 25). 


\ses 


f(P,u) := J^Ps'^s - 112 Ps 
seS 

9(P,u) := J2Ps^s 
seS 

Tii(p) := max{-p4 

F2(p) :=J2Ps-^ 

seS 

Tisip) ■=^-J 2 Ps 

sGS 

( \ (Ps Ps) 

R4(p) := X-^- 

sGS Ps 


Usj Mv, u) = sup{-— : ul + UsZ = Vs, Vs e S'} 


g [v,u) = 


^iiVV) = 


^i2VV) = 




- p 


0, if Vs = Us, \/s E S 
oo, otherwise 

0, if Us < 0, Vs G S' and J2sesUs > —1 
oo, otherwise 

1, if Us = 1, Vs G S' 
oo, otherwise 

— 1, if Us = —1, Vs G S' 
oo, otherwise 

1 

Qs [ 

s€S 


=p + Y,Ps {^Vl + Us^ 
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Plugging in these formulas in (7) yields F{a) 


min w s.t. 


((v Y \ 

to2 — io3 + pto4 + X! ^ ('^' 04)5 ) + -r < 

seS \ ^^04 / 4 

(t^oi)s < 0 (nii)s < 0, Vs e S' 

X]('^0l)s ^ —toi 

s€S s&S 

ul + UgZ = UgOi + (noi)s + to2 — to3 + (^04)5, \/s E S 
tl2 — tl3 + ptlA + ^ Ps ( ^ . + ('^ 14 ) 5 ] < 0 

sGS V 4fl4 / 

Us + + ti2 ~ tl3 + ('i^l4)s = 0, Vs G S' 

Us = '^TisXi, WS E S 

i&I 

J 2 ^i=C 

i£l 

t E Rf^,uE e G R, a; G RI^I,;^ G R. 


(14) 


(15) 


This problem is not convex because of the product UgZ in (15). Similar to Yamkoglu et ah 
(2013, Theorem 1) the problem can be made convex by replacing to 2 ~ ^03 in (14) with 
+ UgZ — UgOi — (noi)s ~ (^ 04 )* and omitting (15) from the problem. Constraint (14) then 
becomes: 


u 


z 

s + 7T 


Uga — (noi)s ~ (^04)5 + ptoi 



< w, \/s E S', (16) 


which is jointly convex in all variables. In order to improve the tractability and accuracy of 
computing F{a), we cast it as a conic quadratic problem. The only complicating terms are 
{vqYI/ ( 4 ^ 04 ), which can be reformulated using a standard trick. Constraint (16) is satished if 
and only if there exists auxiliary variables Ug such that the following inequalities are satished: 



(noi)s 


(^'04)s + pto4 + 2^pg, {yg> + (no4)s') <w,^s E S 

s'es 


f ( 2 ^ 04)5 \ 
[vs - 4to4/ 


<ys + 4fo4, Vs G S'. 


The problem (R-I) can now be solved by determining the root of F. 

We perform a numerical analysis on 10 items and a generated data set of 50 scenarios. 
In order to incorporate correlations, we hrst construct a covariance matrix AA , where A 
is a 10 X 10 matrix whose entries are uniformly and independently distribnted on [—0.5, 0.5]. 
Then, to rehect the idea that a higher risk gives a higher expected return, a vector of expected 
retnrns is constrncted with a linear mapping on the variances of the items. The mapping 
is constructed such that the item with the smallest variance gets an expected retnrn of 0.01, 
and the item with the largest variance gets an expected retnrn of 0.20. Finally, the scenarios 
are drawn, each from a multivariate normal distribution with the constructed mean fi and 
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covariance AA^. We solve the model for p = 1, ps = 0.02 for all s, and C = 100 to obtain a 
robust solution using YALMIP (Lofberg, 2012) and MOSEK (Mosek ApS, Denmark) with 
their default settings. For this value of p, the probabilities can vary, on average, between 0 
and 0.04. Additionally, we solve the same problem for p = 0, i.e., when ps = Ps, to obtain a 
nonrobust solution. We use bisection search on the interval determined by (8) and (9) and 
stopped when the interval width was less than 10“^° . One step in bisection search takes 
around 2 seconds, of which around 7% is spent by MOSEK. 

The convergence of the bisection method turns out to be adequate. Let Xi denote the 
solution in iteration i, and let x* denote the hnal solution. The initial search interval is 
[0.70, 20.09]. The solution x^, obtained from solving F((0.70 + 20.09)/2), is far from optimal; 
||a3i — ~ 2.2. In each three or four iterations, Xi gains one extra digit of accuracy. 

After 22 iterations, the accuracy has improved to \ \x 22 — ~ 4.1 ■ 10“^. The algorithm 

terminates after 37 iterations, with no apparent improvement following the 22'^'^ iteration. 
Since ||a ?*||2 ~ 44.9, the error after 22 iterations is relatively small. 

When Ps = Ps for all s, the mean-variance ratio of the nominal solution (which is 6.34) 
is indeed lower than that of the robust solution (which is 6.45). For both solutions we 
determined the worst case p and the corresponding objective value. The objective of the 
robust solution (which is 18.62) is slightly better than that of the nominal solution (which is 
18.98). This shows that uncertainty may cause a factor three deterioration of the objective 
value. Relative to this large difference, the difference between the two solutions is small. So, 
the nominal solution performs quite well for this example. For the worst case probabilities 
for the robust solution, we have computed the optimal portfolio as if these solutions were 
known beforehand (perfect hindsight solution). The objective value equals that of the robust 
solution. So, even though the robust objective could deteriorate substantially, there is no 
other solution that performs better. 

5.3 Data Envelopment Analysis 

Data Envelopment Analysis (DEA) is a tool to estimate the efficiency of different decision 
making units (DMUs), based on their inputs and outputs. DEA was originally introduced 
for not-for-proht companies, e.g., schools where inputs could be number of teacher hours 
and number of students per class, and outputs could be arithmetic scores and psychological 
tests of student attitudes, e.g., toward the community (Charnes et ah, 1978). The applica¬ 
bility of DEA is not limited to nonproht organizations. A reference list of more than 4,000 
publications on DEA is given by Emrouznejad et al. (2008). 

Let rii and Uo denote the number of in- and outputs, respectively. The efficiency of a 
DMU is dehned as the largest fraction of weighted outputs divided by weighted inputs, given 
that the efficiency of the other DMUs is at most 1: 

(DEA) max ^— s.t. —— < 1, V* G /, 

V Xo V Xi 

where Xi and yi are the vectors of inputs and outputs of DMU i, and u and v are the 
non-negative weights. 

The inputs and outputs are model parameters that have to be acquired from each DMU 
and are affected by measurement errors. Especially when a single DMU represents a group 
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of smaller business units or is a pool of all activities in a certain region, and the inputs 
and outputs are aggregated, errors become practically inevitable. There have been many 
attempts to incorporate uncertainty in DEA. For an overview, e.g., see (Shokouhi et ah, 
2014). Since our focus is on RO, we only discuss the three papers that are relevant. The hrst 
only considers uncertain outputs (Sadjadi and Omrani, 2008). The second considers jointly 
uncertain inputs and outputs (Shokouhi et al., 2010). Unfortunately, the robust counterpart 
in the latter is constructed in an ad-hoc manner that results in a nonconvex formulation, 
for which it is not clear whether globally optimal solutions were found. The third considers 
either uncertain inputs or uncertain outputs (Wang and Wei, 2010). In the last two papers, 
a simulation study is performed to quantify the improvement offered by the robust solution. 
For each randomly drawn set of inputs and outputs, they compute the relative efficiencies 
with the u and v obtained from the robust solution. However, in our view, when the inputs 
and outputs are fully known, the relative efficiencies can only be computed by optimizing 
(DEA) for those known inputs and outputs. Our results are therefore different in two ways. 
First, we consider both uncertain inputs and uncertain outputs and solve the correct problem. 
Second, we perform a valid simulation study to verify whether the robust solution is better 
than the nominal solution. 

In this section, we take the data from Shokouhi et al. (2010). In this data set there are 
Eve DMUs, two inputs and two outputs. The in- and outputs are uncertain, but known to 
reside in given intervals, given in Table 4. 

In order to get in the minimization framework, the objective of (DEA) is replaced with 
its reciprocal. The optimal solution of the robust counterpart of (DEA) then corresponds to 
the reciprocal of the root of: 


F(a) 


min w s.t. 

jveK"® ,uieiR 


Xq — au yo < w, V(a3, y) G W, 
uyi < Xi, V(a3, y) EU,\/i E I. 


Following Shokouhi et al. (2010), we take the Bertsimas and Sim uncertainty region: 


U = {{x,y)e X M(lh+i)xn. . ^ ^ y^^ + 


nji 


|vec(C,C)IL<l, ||vec(r,C)||i<r}, 


where Xij and yij are the midpoints, Axij and At/ij are the half-widths of the uncertainty 
intervals, and the vec operator stacks the columns of the matrix arguments into a single 
vector. For robust LP, this set has the property that when T is integer, it controls the number 
of uncertain elements that can deviate from their nominal values (Bertsimas and Sim, 2004). 
This property also holds for a robust LFP, since F{a) is a robust LP. 

The optimal weights u and v depend on the actual inputs and outputs. One may there¬ 
fore be inclined to use Adjustable Robust Optimization (ARO) when P is larger than the 
dimensions of Xi and yi added. Consequently, u and v are replaced by functions of the 
uncertain parameters. Unfortunately, even in the simple case of affine decision rules, this is 
often intractable. In the constraints of F{a), u and v are multiplied with uncertain param¬ 
eters, which yields constraints that are quadratic in the uncertain parameters. These can 
currently only be solved efficiently for ellipsoidal uncertainty sets. 
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Table 4: Data set for the DEA example of Section 5.3. 


DMUj Input 1 

Input 2 

Output 1 

Output 2 

1 

[14,15] 

[0.06, 0.09] 

[157, 161] 

[28, 40] 

2 

[4,12] 

[0.16, 0.35] 

[157, 198] 

[21, 29] 

3 

[10,17] 

[0.10, 0.70] 

[143, 159] 

[28, 35] 

4 

[12,15] 

[0.21, 0.48] 

[138, 144] 

[21, 22] 

5 

[19,22] 

[0.12, 0.19] 

[158, 181] 

[21, 25] 


We use bisection search on the interval determined by (8) and (9) to determine the root 
of F{a), and stop when the interval width is less than 10“^ (which turns out to be accurate 
enough for ranking the DMUs). F{a) is computed using YALMIP and MOSEK with their 
default settings, and takes a few tenths of a second on a normal desktop computer, where 
MOSEK accounts for approximately 10% of that time. The time it takes to compute F{a) 
turns out to be approximately constant, so independent of the remaining width of the interval 
and independent of the size of the uncertainty region T. The root of F{a) is determined in 
a few seconds. 

We computed the robust efficiencies of the DMUs for T ranging between 0 and 4 in steps 
of 0.1, since each constraint has at most four uncertain parameters. For T < 0.2, the list 
of DMUs ranked from most to least efficient, is 1, 2, 3, 5, 4. For T > 0.3, DMUs 3 and 5 
switch positions. Hence, DMU 5 is more efficient than DMU 3 when T > 0.3. We have tried 
to verify this claim by running 100 simulations, where in each simulation we uniformly drew 
inputs and outputs from the uncertainty region, solved (DEA) for each set of inputs and 
outputs, and ranked the DMUs based on efficiency. In 76 out of 100 simulations, DMU 3 
was more efficient than DMU 5. This result advocates against the use of RO in DEA, since 
it shows that for T = 0 (i.e., the nonrobust solution), the ranking is better than for T > 1. 
We have also performed the simulation with more extreme data, by drawing the inputs and 
outputs only from the endpoints of their uncertainty intervals. This yielded similar results. 
Other experiments, where we used an ellipsoidal uncertainty region instead of the Bertsimas 
and Sim uncertainty region, or where we used the nominal objectives (based on x and y) 
but kept the uncertain constraints, also yielded similar results. 


6 Conclusions 

We have shown how RO can be applied to FPs as a method to deal with uncertain data. 
The method has been tested on three problems. In all three examples, we observe that 
the nominal solution, which is obtained by solving the deterministic problem, is severely 
affected by uncertainty. Surprisingly, this also holds for the robust solution, and in none 
of the examples the robust solution offers a signihcant improvement; even when comparing 
worst case performance. 

The hrst question that arises, is why the nominal solution performs so well. We try 
to answer this question for the mean-variance optimization problem, and note that the 
explanation for the multi-item newsvendor problem is similar. For a given solution, the worst 


17 



case for the mean is when the probability vector is a unit vector, that assigns unit weight 
to the scenario with lowest return. For the variance, the worst case is when the scenarios 
with the lowest and highest return each occur with probability 0.5. For a robust solution 
w.r.t. the mean value, the scenario with lowest return should be optimized, whereas for the 
variance, the returns in the scenarios with lowest and highest return should be close to each 
other. The nominal solution simultaneously maximizes the expected value and minimizes 
the variance. While not identical to the robust objective, it contains some aspects of it. For 
example, the mean is a weighted sum that contains the return for the scenario with lowest 
return. 

The second question that arises, is why there is a realization of the uncertain param¬ 
eters in Sections 5.1 and 5.2, for which no solution can outperform the robust solution; 
even if the former is optimized as if the realization of the uncertain parameters are known 
beforehand. This turns out to be due to Sion’s minimax theorem (Sion, 1958). The as¬ 
sumptions (a)-(g) ensure that f{ao,x)/g{ao,x) is quasi-convex in x (for fixed ao), quasi¬ 
concave in ao (for fixed x), and continuous, that the uncertainty set is compact and convex, 
and that the feasible set for x, say X, is convex. Therefore, by Sion’s minimax theorem, 
f{ao,x)/g{ao,x) = fiao,x)/g{ao,x). This no longer 

holds when there is uncertainty in the constraints, since the feasible region X changes when 
the values for the uncertain parameters are known. 

So, the robust solution is good in the sense that it cannot be improved in the worst case, 
even if the values of the uncertain parameters are known beforehand. On the other hand, 
the nominal solution performs well, at least in the examples studied. It shall be interesting 
to see the difference in real-life examples, especially with uncertainty in the constraints. 
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A The Importance of Convexity Conditions 


We provide a short example to stress the importance of the convexity/concavity conditions 
on / and g. The second numerical example by Lin and Sheu (2005) is: 


min max 

a;eK’* aG[0,l] 


a^XiX2 + + ax^ ^ ^ r ^ 

—-—7-^^ — —5-7- s.t. 0.5 < Xi <5 

5(a — lYx\ + 2x2 + 4aa;3 


i = 1,2,3. 


This problem does not satisfy the convexity/concavity conditions from Section 4. Lin and 
Sheu claim that x = (0.5 1.5 0.5) and a = 0 is optimal with a value of 0.21 (reported as 

—0.21), but X = (0.5 5 0.5) and a = 1 is a better solution (maybe still not optimal) since 

the corresponding value is 0.06. 


B On the Result by Kaul et al. (1986) 

This appendix shows a mistake in the paper by Kaul et al. (1986). Essentially, they formulate 
the dual of: 

b + 

min a s.t. ^ < a, V( 6 o, b) e Ui x W 2 , V(co, c) & UrX W 4 , Ax < d. 

aGR+,a;GR!;: Cq + C X ~ ’ ^ ^ V u, / .5 4, _ 

Note that x is non-negative. In their Lemma 2.1, they claim that the worst case (cq, c) 
does not depend on x, and is given by Cq = miucoewslco} and c*, with components c* = 
mincGW 4 {ci}. This implicitly assumes that c* is a member of U 4 ^, which is not always true. 
The mistake becomes clear in their numerical example, where they use c* = [4; 2], which 
is not in the uncertainty set. Consequently, the proposed approach gives the wrong dual 
problem and a suboptimal solution. Our results in Section 4.2 can provide the correct dual 
problem under milder conditions on the uncertainty sets. 
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